Robustness in Markov Decision Problems with Uncertain Transition Matrices
Abstract
Optimal solutions to Markov Decision Problems (MDPs) are very sensitive to the state transition probabilities. In many practical problems, the estimates of those probabilities are far from accurate. Hence, estimation errors are a limiting factor in applying MDPs to real-world problems. We propose an algorithm for solving finite-state, finite-action MDPs whose solution is guaranteed to be robust with respect to estimation errors in the state transition probabilities. Our algorithm involves a statistically accurate yet numerically efficient representation of uncertainty, via Kullback-Leibler divergence bounds. The worst-case complexity of the robust algorithm is the same as that of the original Bellman recursion. Hence, robustness can be added at practically no extra computing cost.
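The idea of a Bellman recursion made robust through Kullback-Leibler bounds can be illustrated with a small sketch. It is not the authors' algorithm; it is one possible reading, under assumptions the abstract does not state: rewards are maximized, nature picks the worst transition distribution p within KL(p || q) <= beta of the nominal estimate q, the same beta applies to every state-action pair, and the inner problem is solved through its one-dimensional dual with a plain ternary search. The names `worst_case_expectation`, `robust_value_iteration`, `P`, `R`, `beta` and `gamma` are hypothetical.

```python
import math


def worst_case_expectation(q, v, beta):
    """Lower bound on min_p sum(p*v) subject to KL(p || q) <= beta.

    Uses the dual form max over mu > 0 of
        -mu*beta - mu*log(sum_i q_i * exp(-v_i / mu)),
    which is concave in mu, so a ternary search is sufficient.
    """
    support = [i for i, qi in enumerate(q) if qi > 0.0]
    nominal = sum(q[i] * v[i] for i in support)
    v_min = min(v[i] for i in support)
    if beta <= 0.0 or nominal <= v_min:
        return nominal  # no uncertainty, or v is constant on the support

    def dual(mu):
        # Shift by v_min so every exponent is <= 0 (no overflow).
        s = sum(q[i] * math.exp(-(v[i] - v_min) / mu) for i in support)
        return v_min - mu * beta - mu * math.log(s)

    # By Jensen's inequality the maximizer satisfies mu <= (nominal - v_min) / beta.
    hi = (nominal - v_min) / beta
    lo = hi * 1e-12
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dual(m1) < dual(m2):
            lo = m1
        else:
            hi = m2
    # The worst case can never fall below the smallest reachable value.
    return max(v_min, dual(0.5 * (lo + hi)))


def robust_value_iteration(P, R, beta, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Robust Bellman recursion for a discounted, reward-maximizing MDP.

    P[s][a] is the nominal next-state distribution (a list of probabilities),
    R[s][a] is the immediate reward, beta is the KL radius (assumed uniform).
    """
    n = len(P)
    V = [0.0] * n
    for _ in range(max_iter):
        V_new = [
            max(
                R[s][a] + gamma * worst_case_expectation(P[s][a], V, beta)
                for a in range(len(P[s]))
            )
            for s in range(n)
        ]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new
    return V
```

With `beta = 0` the recursion reduces to ordinary value iteration on the nominal model; increasing `beta` can only lower the resulting values, since nature is given a larger set of transition distributions to choose from. Each backup costs one scalar search per state-action pair on top of the usual sum over next states.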
Related articles
Robust Markov Decision Processes with Uncertain Transition Matrices
Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov dec...
Robust Control of Markov Decision Processes with Uncertain Transition Matrices
Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov dec...
A Markov Model for Performance Evaluation of Coal Handling Unit of a Thermal Power Plant
The present paper discusses the development of a Markov model for performance evaluation of the coal handling unit of a thermal power plant using a probabilistic approach. The coal handling unit ensures a proper supply of coal for the sound functioning of the thermal power plant. In the present paper, the coal handling unit consists of two subsystems with two possible states, i.e. working and failed. Failure and repair...
A fuzzy approach to Markov decision processes with uncertain transition probabilities
In this paper, a Markov decision model with uncertain transition matrices, which allow a matrix to fluctuate at each step in time, is described by the use of fuzzy sets. We find a Pareto optimal policy maximizing the infinite horizon fuzzy expected discounted reward over all stationary policies under some partial order. The Pareto optimal policies are characterized by maximal solutions of an op...
A fuzzy treatment of uncertain Markov decision processes: Average case
In this paper, the uncertain transition matrices for inhomogeneous Markov decision processes are described by use of fuzzy sets. Introducing a ν-step contractive property, called a minorization condition, for the average case, we find a Pareto optimal policy maximizing the average expected fuzzy rewards under some partial order. The Pareto optimal policies are characterized by maximal solution...